Importance of Aligning Training Strategy with Evaluation for Diffusion Models in 3D Multiclass Segmentation
Recently, denoising diffusion probabilistic models (DDPM) have been applied
to image segmentation by generating segmentation masks conditioned on images,
though these applications have mainly been limited to 2D networks, without
exploiting the potential benefits of a 3D formulation. In this work, we studied a
DDPM-based segmentation model for 3D multiclass segmentation on two large
multiclass data sets (prostate MR and abdominal CT). We observed that the
difference between training and test methods led to inferior performance for
existing DDPM methods. To mitigate the inconsistency, we proposed a recycling
method which generated corrupted masks based on the model's prediction at a
previous time step instead of using ground truth. The proposed method achieved
statistically significantly improved performance compared to existing DDPMs,
independent of a number of other techniques for reducing train-test
discrepancy, including performing mask prediction, using Dice loss, and
reducing the number of diffusion time steps during training. The performance of
diffusion models was also competitive and visually similar to
non-diffusion-based U-net, within the same compute budget. The JAX-based
diffusion framework has been released at
https://github.com/mathpluscode/ImgX-DiffSeg.
Comment: Accepted at Deep Generative Models workshop at MICCAI 202
A Recycling Training Strategy for Medical Image Segmentation with Diffusion Denoising Models
Denoising diffusion models have found applications in image segmentation by
generating segmented masks conditioned on images. Existing studies
predominantly focus on adjusting model architecture or improving inference,
such as test-time sampling strategies. In this work, we focus on improving the
training strategy and propose a novel recycling method. During each training
step, a segmentation mask is first predicted given an image and a random noise.
This predicted mask, which replaces the conventional ground-truth mask, is then
used for the denoising task during training. This approach can be interpreted as
aligning the training strategy with inference by eliminating the dependence on
ground truth masks for generating noisy samples. Our proposed method
significantly outperforms standard diffusion training, self-conditioning, and
existing recycling strategies across multiple medical imaging data sets: muscle
ultrasound, abdominal CT, prostate MR, and brain MR. This holds for two widely
adopted sampling strategies: denoising diffusion probabilistic model and
denoising diffusion implicit model. Importantly, existing diffusion models
often display a declining or unstable performance during inference, whereas our
novel recycling consistently enhances or maintains performance. We show that,
under a fair comparison with the same network architectures and computing
budget, the proposed recycling-based diffusion models achieved on-par
performance with non-diffusion-based supervised training. By ensembling the
proposed diffusion and the non-diffusion models, significant improvements to
the non-diffusion models have been observed across all applications,
demonstrating the value of this novel training method. This paper summarizes
these quantitative results and discusses their values, with a fully
reproducible JAX-based implementation, released at
https://github.com/mathpluscode/ImgX-DiffSeg.
Comment: Accepted for publication at the Journal of Machine Learning for
Biomedical Imaging (MELBA) https://melba-journal.org/2023:01
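The recycling idea above — predict a mask first, then corrupt the prediction rather than the ground truth for the denoising target — can be sketched in a few lines. This is a minimal NumPy illustration, not the released JAX implementation: `model`, the toy noise schedule, and the squared-error loss are all simplifying assumptions (the paper uses a Dice loss and a learned network).

```python
import numpy as np

rng = np.random.default_rng(0)

def alpha_bar(t, T=1000):
    """Toy linearly decaying signal level (illustrative only, not the paper's schedule)."""
    return 1.0 - t / T

def q_sample(x0, t, noise, T=1000):
    """Corrupt a clean mask x0 at step t, as in standard DDPM training."""
    ab = alpha_bar(t, T)
    return np.sqrt(ab) * x0 + np.sqrt(1.0 - ab) * noise

def recycling_step(model, image, gt_mask, t, T=1000):
    """One recycling training step (sketch).

    `model(image, noisy_mask, t)` is a hypothetical network predicting the
    clean mask from a noisy one -- not the authors' actual API.
    """
    # 1) Predict a mask from pure random noise (no ground truth involved),
    #    mirroring what the model sees at inference time.
    noise = rng.standard_normal(gt_mask.shape)
    pred_mask = model(image, noise, T)  # treated as stop-gradient in practice
    # 2) Corrupt the *predicted* mask instead of the ground-truth mask.
    noisy = q_sample(pred_mask, t, rng.standard_normal(gt_mask.shape), T)
    # 3) Supervise the denoising prediction against the ground truth.
    denoised = model(image, noisy, t)
    return np.mean((denoised - gt_mask) ** 2)  # Dice loss in the paper
```

The key difference from standard DDPM training is step 2: the noisy sample no longer depends on the ground-truth mask, which aligns the training-time inputs with those available at inference.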
Cross-Modality Image Registration using a Training-Time Privileged Third Modality
In this work, we consider the task of pairwise cross-modality image registration, which may benefit
from exploiting additional images available only at training
time from an additional modality that is different to those
being registered. As an example, we focus on aligning
intra-subject multiparametric Magnetic Resonance (mpMR)
images, between T2-weighted (T2w) scans and diffusion-weighted scans with a high b-value (DWI_{high−b}). For the application of localising tumours in mpMR images, diffusion
scans with zero b-value (DWI_{b=0}) are considered easier to
register to T2w due to the availability of corresponding
features. We propose a learning from privileged modality
algorithm, using a training-only imaging modality DWI_{b=0},
to support the challenging multi-modality registration problems. We present experimental results based on 369 sets of
3D multiparametric MRI images from 356 prostate cancer
patients and report, with statistical significance, a lowered
median target registration error of 4.34 mm, when registering the holdout DWI_{high−b} and T2w image pairs, compared
with that of 7.96 mm before registration. Results also show
that the proposed learning-based registration networks enabled efficient registration with comparable or better accuracy, compared with a classical iterative algorithm and
other tested learning-based methods with/without the additional modality. These compared algorithms also failed
to produce any significantly improved alignment between
DWI_{high−b} and T2w in this challenging application.
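The reported numbers (median TRE of 4.34 mm after vs. 7.96 mm before registration) are computed over corresponding landmarks. A short sketch of that metric, with a hypothetical `transform` callable standing in for the trained registration network:

```python
import numpy as np

def target_registration_error(moving_landmarks, fixed_landmarks, transform):
    """Median target registration error (TRE), in landmark-coordinate units (e.g. mm).

    `transform` is any callable mapping a moving-image point into fixed-image
    space; landmark arrays have shape (n, 3) with row-wise correspondence.
    """
    warped = np.asarray([transform(p) for p in moving_landmarks])
    errors = np.linalg.norm(warped - fixed_landmarks, axis=1)
    return np.median(errors)
```

Passing the identity transform recovers the "before registration" error, which is how the paired before/after comparison in the abstract is made.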
Few-shot image segmentation for cross-institution male pelvic organs using registration-assisted prototypical learning
The ability to adapt medical image segmentation networks for a novel class
such as an unseen anatomical or pathological structure, when only a few
labelled examples of this class are available from local healthcare providers,
is sought-after. This potentially addresses two widely recognised limitations
in deploying modern deep learning models to clinical practice,
expertise-and-labour-intensive labelling and cross-institution generalisation.
This work presents the first 3D few-shot interclass segmentation network for
medical images, using a labelled multi-institution dataset from prostate cancer
patients with eight regions of interest. We propose an image alignment module
registering the predicted segmentation of both query and support data, in a
standard prototypical learning algorithm, to a reference atlas space. The
built-in registration mechanism can effectively utilise the prior knowledge of
consistent anatomy between subjects, regardless of whether they are from the same
institution or not. Experimental results demonstrated that the proposed
registration-assisted prototypical learning significantly improved segmentation
accuracy (p-values<0.01) on query data from a holdout institution, with varying
availability of support data from multiple institutions. We also report the
additional benefits of the proposed 3D networks with 75% fewer parameters and
an arguably simpler implementation, compared with existing 2D few-shot
approaches that segment 2D slices of volumetric medical images.
Image quality assessment for machine learning tasks using meta-reinforcement learning
In this paper, we consider image quality assessment (IQA) as a measure of how amenable images are with respect to a given downstream task, or task amenability. When the task is performed using machine learning algorithms, such as a neural-network-based task predictor for image classification or segmentation, the performance of the task predictor provides an objective estimate of task amenability. In this work, we use an IQA controller to predict the task amenability which, itself being parameterised by neural networks, can be trained simultaneously with the task predictor. We further develop a meta-reinforcement learning framework to improve the adaptability of both IQA controllers and task predictors, such that they can be fine-tuned efficiently on new datasets or meta-tasks. We demonstrate the efficacy of the proposed task-specific, adaptable IQA approach, using two clinical applications for ultrasound-guided prostate intervention and pneumonia detection on X-ray images.
Image quality assessment by overlapping task-specific and task-agnostic measures: application to prostate multiparametric MR images for cancer segmentation
Image quality assessment (IQA) in medical imaging can be used to ensure that
downstream clinical tasks can be reliably performed. Quantifying the impact of
an image on the specific target tasks, also termed task amenability, is
needed. A task-specific IQA has recently been proposed to learn an
image-amenability-predicting controller simultaneously with a target task
predictor. This allows for the trained IQA controller to measure the impact an
image has on the target task performance, when this task is performed using the
predictor, e.g. segmentation and classification neural networks in modern
clinical applications. In this work, we propose an extension to this
task-specific IQA approach, by adding a task-agnostic IQA based on
auto-encoding as the target task. Analysing the intersection between
low-quality images, deemed by both the task-specific and task-agnostic IQA, may
help to differentiate the underpinning factors that caused the poor target task
performance. For example, common imaging artefacts may not adversely affect the
target task, which would lead to a low task-agnostic quality and a high
task-specific quality, whilst individual cases considered clinically
challenging, which cannot be improved by better imaging equipment or
protocols, are likely to result in a high task-agnostic quality but a low
task-specific quality. We first describe a flexible reward shaping strategy
which allows for the adjustment of weighting between task-agnostic and
task-specific quality scoring. Furthermore, we evaluate the proposed algorithm
using a clinically challenging target task of prostate tumour segmentation on
multiparametric magnetic resonance (mpMR) images, from 850 patients. The
proposed reward shaping strategy, with appropriately weighted task-specific and
task-agnostic qualities, successfully identified samples that need
re-acquisition due to a defective imaging process.
Comment: Accepted for publication at the Journal of Machine Learning for
Biomedical Imaging (MELBA) https://www.melba-journal.or
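The reward shaping and the quadrant interpretation described above can be sketched as follows. This is the simplest convex combination consistent with the abstract — the paper's actual shaping, thresholds, and category names may differ, and `quadrant` is purely an illustrative reading of the two scores:

```python
def shaped_reward(task_specific_q, task_agnostic_q, w):
    """Weighted combination of the two quality scores (sketch).

    `w` in [0, 1] trades off task-specific against task-agnostic quality;
    assumed here to be a plain convex combination.
    """
    return w * task_specific_q + (1.0 - w) * task_agnostic_q

def quadrant(task_specific_q, task_agnostic_q, thresh=0.5):
    """Illustrative interpretation of the intersection of the two scores."""
    if task_agnostic_q < thresh and task_specific_q < thresh:
        return "re-acquire: defective imaging process"
    if task_agnostic_q >= thresh and task_specific_q < thresh:
        return "clinically challenging case"
    if task_agnostic_q < thresh and task_specific_q >= thresh:
        return "artefact, but harmless for the task"
    return "good quality"
```

Separating the two axes is the point of the extension: low scores on both suggest a fixable acquisition problem, whereas a low task-specific score with a high task-agnostic score flags an intrinsically hard case.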
Prototypical few-shot segmentation for cross-institution male pelvic structures with spatial registration
The prowess that makes few-shot learning desirable in medical image analysis
is the efficient use of the support image data, which are labelled to classify
or segment new classes, a task that otherwise requires substantially more
training images and expert annotations. This work describes a fully 3D
prototypical few-shot segmentation algorithm, such that the trained networks
can be effectively adapted to clinically interesting structures that are absent
in training, using only a few labelled images from a different institute.
First, to compensate for the widely recognised spatial variability between
institutions in episodic adaptation of novel classes, a novel spatial
registration mechanism is integrated into prototypical learning, consisting of
a segmentation head and a spatial alignment module. Second, to assist the
training with observed imperfect alignment, a support mask conditioning module is
proposed to further utilise the annotation available from the support images.
Extensive experiments are presented in an application of segmenting eight
anatomical structures important for interventional planning, using a data set
of 589 pelvic T2-weighted MR images, acquired at seven institutes. The results
demonstrate the efficacy in each of the 3D formulation, the spatial
registration, and the support mask conditioning, all of which made positive
contributions independently or collectively. Compared with the previously
proposed 2D alternatives, the few-shot segmentation performance was improved
with statistical significance, regardless of whether the support data come from
the same or different institutes.
Comment: accepted by Medical Image Analysis
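The prototypical core underlying these few-shot methods — masked average pooling of support features into class prototypes, then nearest-prototype labelling of query voxels — can be sketched as below. This NumPy fragment omits the paper's registration and support-mask-conditioning modules entirely, and cosine similarity is often used instead of the Euclidean distance assumed here:

```python
import numpy as np

def prototypes(support_feats, support_masks):
    """Masked average pooling: one prototype per class.

    support_feats: (n, d) per-voxel features from the support images;
    support_masks: (n, k) one-hot class labels for the same voxels.
    Returns a (k, d) array of class prototypes.
    """
    counts = support_masks.sum(axis=0, keepdims=True).T          # (k, 1)
    return (support_masks.T @ support_feats) / np.maximum(counts, 1)

def segment(query_feats, protos):
    """Label each query voxel by its nearest prototype (Euclidean distance)."""
    d = np.linalg.norm(query_feats[:, None, :] - protos[None, :, :], axis=-1)
    return d.argmin(axis=1)
```

Because the prototypes are recomputed from whatever support images are provided at test time, the network can segment classes absent from training — the spatial registration in the paper then compensates for anatomical misalignment between the support and query subjects.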